PROGRESS REPORT ON CONTRACT N00014-86-K-0217
PRINCIPAL  INVESTIGATOR:  Richard  A.  Laursen 
CONTRACTOR:  Boston  University 

CONTRACT  TITLE:  Characterization  of  Marine  Bioadhesive  Proteins 
START  DATE:  1  April  1986 

RESEARCH OBJECTIVE: The primary initial objective has been to clone and
sequence adhesive protein genes from several species of mussel, with the aim of
understanding what common structural features (if any) give these proteins
their adhesive properties. It is hoped that this knowledge will lead to the
development of adhesives that will have medical and other applications.

PROGRESS (YEAR 2): During the first year and continuing into the second, the
focus of our work was isolating and sequencing several cDNA clones of
fragments of the adhesive protein gene. This work showed that the adhesive
protein of M. edulis consists primarily of repeats of the decapeptide

xx1-Lys-xx2-xx3-Tyr-Pro-Pro-Thr-Tyr-Lys

where xx1 is usually Pro, Ser or Ala; xx2 is Pro, Ser, Leu, Ile or Lys; and
xx3 is Thr or Ser. Using our original methods, however, we have not been able
to obtain a clone or set of overlapping clones that encodes the entire
protein. It appears that recombination, due to the repetitive nature of the
gene, is occurring during cloning. Recently we have tried to overcome this
problem: we have fractionated our cDNA library, selected a fraction (3.3 kbp)
large enough to code for the entire protein, and are carrying out the
subsequent cloning steps in recombination-deficient host strains. We have also
isolated M. edulis genomic DNA and are currently screening the genomic library.
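
The decapeptide consensus described above can be written as a simple pattern and used to locate repeats in a translated clone. The following is only a minimal sketch: it assumes one-letter amino acid codes, treats the "usual" residues at xx1 as the only ones allowed, ignores posttranslational modification of Tyr and Pro, and uses a made-up test sequence rather than any sequence from this project.

```python
import re

# Consensus from the text: xx1-Lys-xx2-xx3-Tyr-Pro-Pro-Thr-Tyr-Lys
#   xx1 = Pro, Ser or Ala        -> [PSA]
#   xx2 = Pro, Ser, Leu, Ile, Lys -> [PSLIK]
#   xx3 = Thr or Ser             -> [TS]
# Assumption: strict matching; rarer residues at xx1 are not covered.
DECAPEPTIDE = re.compile(r"[PSA]K[PSLIK][TS]YPPTYK")


def find_repeats(protein: str) -> list[tuple[int, str]]:
    """Return (start offset, matched decapeptide) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in DECAPEPTIDE.finditer(protein)]


# Hypothetical sequence for illustration only: three repeats that fit the
# consensus, followed by one that does not (Gly at xx1).
example = "AKPSYPPTYK" + "PKISYPPTYK" + "SKLTYPPTYK" + "GKPSYPPTYK"
hits = find_repeats(example)

for start, repeat in hits:
    print(start, repeat)
```

A scan of this kind shows how much of a translated fragment is accounted for by consensus repeats, but it cannot by itself distinguish a genuine variant repeat from a sequencing or cloning artifact.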

In year 2 we have also obtained sequence data from two other species of
mussel, Mytilus californianus and Geukensia demissa, and we will soon have
data from Modiolus modiolus. Cloning of the M. californianus and M. modiolus
genes was carried out as for M. edulis, by construction of a λgt10 cDNA library
and screening with probes from M. edulis. The sequence of a clone from M.
californianus was very similar to that of M. edulis, except for the
occurrence of Arg (50% of the time) at position xx1 and about a 50% occurrence
of Ser and Ala at position xx7.

Cloning the G. demissa gene: A λgt10 library was initially constructed
for this species, but screening with Mytilus probes was unsuccessful because
(as we now know) of the significant sequence differences. For this reason,
mRNA was isolated as usual and transcribed with reverse transcriptase to make
cDNAs, which were then cloned into the lacZ gene of the λgt11 expression vector.
G. demissa adhesive protein was also isolated by extraction of phenol glands
and purification by acid polyacrylamide gel electrophoresis. The
protein-containing band was excised and used directly to immunize a rabbit as a
source of polyclonal antibodies.
The λgt11 library was packaged into phage and used to infect E. coli Y1090
host cells, and β-galactosidase fusion products were detected as colorless
plaques among the blue plaques of nonrecombinants. Colorless plaques were
replated and screened with adhesive protein antibodies and an alkaline
phosphatase-linked secondary antibody, which gave a blue color with the
substrate BCIP (5-bromo-4-chloro-3-indolyl phosphate, p-toluidine salt). Positive
clones were then sequenced in the usual manner. The G. demissa protein is
significantly different in that it contains repeats of 11 to 13 amino
acids, e.g.,

Gly-Lys-Pro-Thr-Thr-Tyr-Asp-Ala-Gly-Tyr-Lys- 
Gly-Gln-Gln-Lys-Gln-Thr-Gly-Tyr-Asp-Thr-Gly-Tyr-Lys- , 

and contains large amounts of glycine and glutamine, but little proline.
Genetic material for this species was obtained by immunoscreening a λgt11 cDNA
library because, in contrast with the other species, we had no protein
sequence data to guide the synthesis of oligonucleotide probes.
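
The composition statement can be checked against the two example repeats quoted above. The sketch below simply tallies residues in those two peptides; because it uses only these two repeats, the resulting fractions are illustrative and should not be read as the composition of the whole protein.

```python
from collections import Counter

# The two G. demissa example repeats, copied from the text above.
REPEATS = [
    "Gly-Lys-Pro-Thr-Thr-Tyr-Asp-Ala-Gly-Tyr-Lys",
    "Gly-Gln-Gln-Lys-Gln-Thr-Gly-Tyr-Asp-Thr-Gly-Tyr-Lys",
]


def residues(peptide: str) -> list[str]:
    """Split a hyphen-separated three-letter-code peptide into residues."""
    return peptide.split("-")


def composition(peptides: list[str]) -> Counter:
    """Count each residue type across all peptides."""
    counts = Counter()
    for peptide in peptides:
        counts.update(residues(peptide))
    return counts


lengths = [len(residues(p)) for p in REPEATS]
counts = composition(REPEATS)
total = sum(counts.values())

print("repeat lengths:", lengths)
for name, n in counts.most_common():
    print(f"{name}: {n}/{total} ({100 * n / total:.0f}%)")
```

In these two repeats glycine is the most frequent residue and proline occurs only once, which is consistent with the description of a glycine- and glutamine-rich, proline-poor protein.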

The recombination problem: Northern blot experiments in which a
32P-labeled oligonucleotide probe is allowed to hybridize with mRNA have
consistently shown, for all species, that the mRNA we have isolated from
mussel phenol glands is long enough (3.0-4.0 kbp) to code for an adhesive
protein with a molecular weight of up to 130,000. However, screening of cDNA
libraries yields not only far fewer clones than we might expect, but also
clones that are much smaller (typically less than 500 base pairs) than the
3500 bp needed to code for the entire protein. This suggests that during the
cloning process large amounts of information are being recombined out, even
though we are using RecA- host strains. Furthermore, the fragments we have
sequenced, and also those sequenced at Genex Corp. (unpublished), do not
overlap, despite the fact that we have enough data to account for more than
the entire protein. So the situation may be even worse than loss of
information: there may also be scrambling.
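
The size figures in this paragraph follow from simple arithmetic, sketched below. The average residue mass of about 110 Da is a common rule of thumb and is an assumption here, not a value taken from this report; the function names are illustrative.

```python
# Assumption: average amino acid residue mass of ~110 Da (rule of thumb).
AVG_RESIDUE_MASS_DA = 110.0
BASES_PER_CODON = 3


def coding_length_bp(molecular_weight: float) -> int:
    """Approximate coding sequence length (bp) for a protein of given mass."""
    n_residues = molecular_weight / AVG_RESIDUE_MASS_DA
    return round(n_residues * BASES_PER_CODON)


def max_residues(insert_bp: int) -> int:
    """Maximum number of residues a cDNA insert of this length can encode."""
    return insert_bp // BASES_PER_CODON


full_length_bp = coding_length_bp(130_000)   # protein of MW 130,000
small_clone_residues = max_residues(500)     # typical clone, under 500 bp
small_clone_repeats = small_clone_residues // 10  # decapeptide repeats

print("bp needed for full protein:", full_length_bp)
print("residues in a 500 bp clone:", small_clone_residues)
print("decapeptide repeats in a 500 bp clone:", small_clone_repeats)
```

Under this assumption a protein of molecular weight 130,000 needs roughly 3500 bp of coding sequence, while a 500 bp clone can cover at most 166 residues, or about 16 decapeptide repeats.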

Other workers have also encountered difficulty in cloning repetitious DNA
sequences in certain E. coli strains. To overcome this problem, we are now
beginning to carry out cloning operations in recombination-deficient hosts
(RecA- and/or RecBC-).


A conformational model for the Mytilus protein: Because of the
invariability of Tyr and Lys residues and the patterns of posttranslational
modification of Tyr and Pro residues, we believe that the adhesive proteins
probably have some sort of regular, as opposed to a "random coil", structure.
Given the large amount of proline, a structure with turns or loops seems more
likely than a regular helical or sheet structure. Given the propensity for
Tyr and Thr residues to occur in β-sheets, and for Pro-Pro sequences not to be
found in β-turns but to cause a 90° bend in the peptide backbone, we have
postulated the following β-sheet/β-turn model to serve as a working hypothesis
for planned spectroscopic studies:
[Figure: proposed β-sheet/β-turn model (not reproduced)]

This model, though speculative, has some attractive features. It puts all
the polar groups on the faces of the β-sheet loop, where they could interact
with surfaces. In addition, the Tyr and Lys residues are on both faces in
pairs, in a symmetrical arrangement, where they might pair up with
corresponding pairs in another chain to form interchain crosslinks. The major
failing of this model is that one cannot make a similar model for the
Geukensia protein, which contains little proline and has a less regular repeat
structure. Of course Geukensia could have a completely different structure,
but one would think, given the relatively constant placement of the critical
Tyr and Lys residues, that there might be some conformational similarities.
This dilemma can be resolved only by experiment.


WORK PLAN (YEAR 3): In year three we plan to concentrate on obtaining the
entire sequence of an adhesive protein, either by sequencing genomic DNA or
through the use of the rec- cloning strains mentioned above. Even if that
fails, we now have or soon will have sufficient sequence data to begin
analyzing the problem of what gives this class of proteins its adhesive
character. During the next year we will focus on obtaining more sequence data
on the G. demissa protein, because it is so different from that of the other
species, and on getting sequence data from M. modiolus.

We  plan  also  to  begin  conformational  and  modeling  studies,  using  high 
resolution  NMR  techniques,  on  the  proteins  or  peptide  models,  since  it  seems 
likely  to  us  that  these  proteins  have  some  sort  of  regular  structure.  If 
time  permits,  we  hope  to  characterize  the  crosslink,  which  is  presumed  to 
occur  between  lysine  and  DOPA  residues  in  these  proteins,  using  chemical  and 
mass  spectrometric  methods. 


